A Database for Measuring Linguistic Information Content
Authors
Abstract
Which languages convey the most information in a given amount of space? Linguists are often asked this question, especially by engineers who have some information-theoretic measure of “information” in mind but rarely define exactly how they would measure it. The question is, in fact, remarkably hard to answer, and many linguists consider it unanswerable. Yet it seems as if it ought to have an answer. If one had a database of close translations between a set of typologically diverse languages, with detailed marking of morphosyntactic and morphosemantic features, one could hope to quantify the differences in how these languages convey information. Since no appropriate database exists, we decided to construct one. The purpose of this paper is to present our work on the database, along with some preliminary results. We plan to release the dataset once...
Similar resources
Integrating Balanced Scorecard with Fuzzy Linguistic and Fuzzy Delphi Method for Evaluating Performance of Team Sports (SANAT NAFT NOVIN Abadan Football Club)
Design and Implementation of a Comprehensive Database of the Written Heritage of Science and Technology
Purpose: This study aims to design and implement a comprehensive database of the written heritage of science and technology in the Regional Information Center for Science and Technology (RICeST) and determine the metadata elements required to describe the manuscripts. Method: This study was carried out by the content analysis method to identify the metadata elements needed to describe the coll...
Measuring Customer Acquisition Value: A Comprehensive Approach to Customer Equity
In the information technology era, databases are known as one of the most valuable resources for organizations, especially when used in database marketing. Customer equity is a key concept in database marketing that integrates customer acquisition, retention, and development. From the perspective of customer equity, customers are the primary source of both current and future cash flows. Customer equity models...
A Linguistic Analysis of the Online Debate on Vaccines and Use of Fora as Information Stations and Confirmation Niche
This study looks at the communication between users concerning health risks, with the aim of exploring their use of fora and assessing whether participants establish a niche with like-minded users during these exchanges. By integrating a corpus linguistic approach with content analysis and multiple studies on computer mediated health discourse, this study analyses the intense attention paid to ...